Installing, Running and Maintaining Large Linux Clusters at CERN

Authors

  • Vladimir Bahyl
  • Benjamin Chardi
  • Jan van Eldik
  • Ulrich Fuchs
  • Thorsten Kleinwort
  • Martin Murth
  • Tim J. Smith
Abstract

Having built up Linux clusters to more than 1000 nodes over the past five years, we already have practical experience confronting some of the LHC-scale computing challenges: scalability, automation, hardware diversity, security, and rolling OS upgrades. This paper describes the tools and processes we have implemented, working in close collaboration with the EDG project [1], especially with the WP4 subtask, to improve the manageability of our clusters, in particular in the areas of system installation, configuration, and monitoring.

Related resources

Scalable Cluster Administration - Chiba City I Approach and Lessons Learned

Systems administrators of large clusters often need to perform the same administrative activity hundreds or thousands of times. Often such activities are time-consuming, especially the tasks of installing and maintaining software. By combining network services such as DHCP, TFTP, FTP, HTTP, and NFS with remote hardware control, cluster administrators can automate all administrative tasks. Scalab...

Large Scale Print Spool Service

The paper describes a project to enhance the print service for CERN. The printer infrastructure consists of over 1000 printers serving more than 5000 Unix users running on workstations of various brands as well as PCs running Linux. In addition, the infrastructure must serve more than 3000 PCs running Windows/95 and NT 4. We support a large number of printer manufacturers, including HP, QMS, Te...

A new Distributed Security Model for Linux Clusters

With the increasing use of clusters in different domains, efficient and flexible security has become an essential requirement for clusters. Though many security mechanisms exist, there is a need to develop more flexible and coherent security mechanisms for large distributed applications. In this paper, we present the need for a unified cluster-wide security space for large distributed appli...

A methodology for flexible species distribution modelling within an Open Source framework

2 Installing the tools: 2.1 Installing Linux; 2.2 Installing GRASS under Linux; 2.3 Installing R; 2.4 Installing R-pack...

CDE: Using System Call Interposition to Automatically Create Portable Software Packages

It can be painfully hard to take software that runs on one person’s machine and get it to run on another machine. Online forums and mailing lists are filled with discussions of users’ troubles with compiling, installing, and configuring software and their myriad of dependencies. To eliminate this dependency problem, we created a system called CDE that uses system call interposition to monitor t...

Journal title:
  • CoRR

Volume cs.DC/0306058  Issue 

Pages  -

Publication date 2003